ABSTRACT

A speech recognition apparatus and method for use in a car navigation
system performs speech processing for recognizing speech or spoken
words referring to a specified region. An input audio signal or
vocalized speech undergoes speech processing to determine and
recognize the region specified in the speech. Data corresponding to
the specified region is converted to coordinate position data for the
region, and a map of the vicinity of the converted coordinate position
data is displayed. When a new audio signal is input during speech
recognition processing of a previously-input audio signal, the
processing of the previously-input audio signal is interrupted and the
new audio signal undergoes speech recognition processing. Accordingly,
a high-efficiency operation of the car navigation system may be
performed without interfering with the driving of the car. The speech
recognition apparatus also determines whether an input audio signal
has been re-input within a predetermined amount of time from when the
audio signal was previously input.

14 Claims, 11 Drawing Sheets 
INTERRUPT CORRECTION OF SPEECH RECOGNITION FOR A NAVIGATION DEVICE

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech recognition apparatus and a
speech recognition method suitably applied to a navigation apparatus
mounted to e.g., a car displaying a road map, etc., the navigation
apparatus and a navigation method combined with this voice recognition
apparatus, and a car for mounting these apparatuses thereon.

2. Description of the Prior Art

Various kinds of navigation apparatuses mounted onto a car, etc. have
been developed. Each of these navigation apparatuses is constructed by
a large capacity data memory means such as a CD-ROM storing e.g., road
map data, a detecting means for detecting the present position of the
car, and a displaying apparatus for displaying a road map in the
vicinity of the detected present position on the basis of data read
from the data memory means. In this case, the detecting means of the
present position is constructed by using a position measuring system
using an artificial satellite for a position measurement called a GPS
(Global Positioning System), a self-contained navigation following up
a change in the present position from a starting spot point on the
basis of information such as a vehicle's running direction, a
vehicle's running speed, etc.

A map displayed in the displaying apparatus is set such that a map in
a desirable position can be displayed as well as the present position
by performing a key operation, etc. as long as map data are prepared.

In the case of such a navigation apparatus, for example, in the case
of the navigation apparatus for a car, the displaying apparatus is
generally arranged in the vicinity of a driver seat such that a driver
can see a map in the vicinity of the present position while the car is
running and temporarily stops as in traffic signal stoppage, etc.

It is necessary to be able to operate such a navigation apparatus such
that no navigation apparatus obstructs driving of the car, etc. For
example, the navigation apparatus is constructed such that a
complicated operation of the navigation apparatus is inhibited during
the car driving. Namely, when such a navigation apparatus is arranged
in a vehicle, the navigation apparatus is connected to a certain
running state detecting section (e.g., a parking brake switch of the
car). The navigation apparatus is set such that all operations of the
navigation apparatus can be performed only when stoppage of the
vehicle is detected by this running state detecting section, and a
complicated key operation is inhibited in a nonstopping state (namely,
during running of the vehicle).

However, it is inconvenient that no operation for switching display
maps, etc. can be performed during such running. Accordingly, it is
required that a high grade operation of the navigation apparatus can
be performed without obstructing the driving of the vehicle even when
the vehicle is running.

In such a case, it is considered that, for example, various kinds of
commands are inputted by a speech input to operate the navigation
apparatus. However, when incorrect commands are inputted through
speech, etc., it is necessary to perform an operation for canceling
the input by a key operation, etc. making it inconvenient to handle
the navigation apparatus.

SUMMARY OF THE INVENTION

In consideration of such problems, an object of the present invention
is to be able to simply perform complex operations of various kinds of
apparatuses such as a navigation apparatus, etc. without obstructing
the driving of a car, etc.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a perspective view showing a state in which an apparatus in
one embodiment of the present invention is assembled into a car.

FIG. 2 is a perspective view showing a portion near a driver seat when
the apparatus in one embodiment is assembled into the car.

FIG. 3 is a schematic in block diagram form showing one embodiment of
the present invention.

FIG. 4 is a diagram showing a memory area construction of a memory for
speech recognition in one embodiment.

FIG. 5 is a diagram showing a memory area construction of a memory for
longitude and latitude conversion in one embodiment.

FIG. 6 is a flow chart showing processing by speech recognition in one
embodiment.

FIG. 7 is a flow chart showing display processing in a navigation
apparatus in one embodiment.

FIG. 8A is a flow chart showing processing from a speech input to a
map display in one embodiment.

FIG. 8B is a flow chart showing processing at a reexecuting time of
speech recognition in one embodiment.

FIG. 9 is a flow chart showing processing when the speech recognition
in one embodiment is executed plural times.

FIG. 10 is an explanatory view showing a display example of a
candidate list in one embodiment.

FIG. 11 is a flow chart showing a processing example at a deleting
time of temporary list of recognized speech in one embodiment.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

One embodiment of the present invention will next be described with
reference to the accompanying drawings.

In this example, the present invention is applied to a navigation
apparatus mounted to a car. An arranging state of the navigation
apparatus mounted to the car in this example will first be explained
with reference to FIGS. 1 and 2. As shown in FIG. 1, a steering wheel
51 of the car 50 is attached to a front portion of a driver seat 52
and a driver sitting on the driver seat 52 basically operates the
navigation apparatus. However, there is also a case in which another
fellow passenger within this car 50 operates the navigation apparatus.
A body 20 of this navigation apparatus and a speech recognition
apparatus 10 connected to this navigation apparatus body 20 are
arranged in an arbitrary space (e.g., within a trunk of a rear
portion) within the car 50. An antenna 21 for receiving a position
measuring signal described later is attached onto the outer side of a
car body (otherwise, within the car such as the inner side of a rear
window, etc.).

As shown in the vicinity of the driver seat in FIG. 2, a talk switch
18 and an operation key 27 of the navigation apparatus described later
are arranged on a side of the steering wheel 51 such that the talk
switch 18 and the operation key 27 are operated without causing any
obstruction during
driving of the car. A displaying apparatus 40 connected to the
navigation apparatus is also arranged in a position in which no field
of view in front of the driver is obstructed. A speaker 32 for
outputting an audio signal synthesized as speech within the navigation
apparatus 20 is attached to the car in a position in which an output
speech reaches the driver (e.g., on a side of the displaying apparatus
40, etc.).

Speech can be inputted to the navigation apparatus in this example.
Therefore, a microphone 11 is attached to a sun visor 53 arranged in
an upper portion of a front glass in front of the driver seat 52 so
that the microphone 11 collects the speech of the driver sitting on
the driver seat 52.

The navigation apparatus body 20 in this example is connected to a
computer 54 for controlling the operation of an engine of this car so
that a pulse signal proportional to a car speed is supplied from this
computer 54 to the navigation apparatus body 20.

An internal construction of the navigation apparatus in this example
will next be explained with reference to FIG. 3. In this example, the
speech recognition apparatus 10 is connected to the navigation
apparatus 20 and the microphone 11 is connected to the speech
recognition apparatus 10. For example, directivity of this microphone
11 is set to be relatively narrow and the microphone 11 is constructed
such that only speech of a person sitting on the driver seat of the
car is preferably collected. For example, a power source of the
navigation apparatus is turned on to collect the speech only while a
talk switch 18 described later is pushed and turned on.

An audio signal collected by this microphone 11 is supplied to an
analog/digital converter 12 and is converted to a digital audio signal
of a predetermined sampling frequency. Then, the digital audio signal
outputted from this analog/digital converter 12 is supplied to a
digital speech processing circuit 13 constructed by an integrated
circuit called a DSP (digital signal processor). In this digital
speech processing circuit 13, the digital speech signal is set to
vector data by processing such as band division, filtering, etc., and
these vector data are supplied to a speech recognition circuit 14.

A ROM 15 for storing speech recognition data is connected to this
speech recognition circuit 14 so that a recognizing operation is
performed in accordance with a predetermined voice recognizing
algorithm (e.g., HMM: hidden Markov model) with respect to the vector
data supplied from the digital speech processing circuit 13. This ROM
15 then selects plural candidates from phonemic models for speech
recognition stored to the ROM 15 and reads character data stored in
accordance with a phonemic model having a highest conformity degree
among these candidates. The speech recognition circuit 14 in this
example also functions as a control means for controlling processing
of each portion within the speech recognition apparatus 10 and judges
an operation of the talk switch 18 described later.

Here, a data storing state of the ROM 15 for storing the speech
recognizing data in this example will be explained. In the case of
this example, only the name of a place and a word for designating an
operation of the navigation apparatus are recognized. As shown in a
setting state of a memory area in FIG. 4, only the names of domestic
urban and rural prefectures, cities, wards, towns and villages are
registered as the name of a place. A character code of this place name
and a phonemic model as data for recognizing the place name as speech
are stored to the memory area every each of the urban and rural
prefectures, cities, wards, towns and villages.

For example, in the case of the interior of the country of Japan, the
number of cities, wards, towns and villages in the whole country is
about 3500 so that about 3500 place names are stored to the memory
area. However, in the case of the place name of "xx town", both data
showing the pronouncing case of "xx machi" and data showing the
pronouncing case of "xx cho" are stored. Similarly, in the case of the
place name of "xx village", both data showing the pronouncing case of
"xx son" and data showing the pronouncing case of "xx mura" are
stored.

The names of urban and rural prefectures tending to be mistaken are
additionally registered with respect to the names of cities, wards,
towns and villages having a high possibility that the names of urban
and rural prefectures are incorrectly remembered such as cities,
wards, towns and villages, etc. adjacent to boundaries of the urban
and rural prefectures in position. Namely, for example, "Kawasaki
city, Kanagawa prefecture" as a correct example is registered and
"Kawasaki city, Tokyo Metropolis" as an incorrect example providing an
adjacent name of each of the urban and rural prefectures is also
registered.

Character codes of words for giving commands of various kinds of
operations such as words designating display positions such as
"destination", "starting spot", "routing spot", "one's own house",
etc., "what time now" (a command for hearing the present time), "where
now" (a command for hearing the present position), "next" (a command
for hearing the next intersection), "how far from here" (a command for
hearing a distance until the destination), "speed" (a command for
hearing the present speed), "altitude" (a command for hearing the
present altitude), "advancing direction" (a command for hearing an
advancing direction), "list" (a command for displaying a list of
recognizable commands in the displaying apparatus), etc., and others
are stored as words for designating the operation of the navigation
apparatus. Further, a phonemic model corresponding to each of these
words is also stored.

When a character code corresponding to a phonemic model and conforming
to recognized results obtained through a predetermined speech
recognizing algorithm from input vector data is the character code of
a place name in the speech recognition circuit 14, this character code
is read from the ROM 15. This read character code is supplied to a
longitude latitude converting circuit 16. A ROM 17 for storing
longitude latitude converted data is connected to this longitude
latitude converting circuit 16. Longitude and latitude data
corresponding to the character data supplied from the speech
recognition circuit 14 and their accompanying data are read from the
ROM 17.

The speech recognition apparatus 14 in this example has a temporary
memory 14a for temporarily storing recognized results in a temporary
list of recognized speech within this temporary memory. Data up to
speech conforming to a certain degree in an order from speech of a
highest conformity degree are also stored as a candidate list at a
recognition processing time. The temporary list of recognized speech
and the candidate list are erased when a certain time has passed since
these lists were stored.

Here, a data storing state of the ROM 17 for storing longitude
latitude converted data in this example will be explained. In the case
of this example, a memory area is set using the same character code as
the character code of a place name stored in the ROM 15 for storing
speech recognizing data. As shown in FIG. 5, latitude and longitude
data of a place name shown by characters and data of a display scale
as accompanying data are stored every character code. The character
code read from the ROM 15 for storing speech recognizing data is set
to a character code by a katakana (the square form of kana in
Japanese). Character codes using Chinese characters for display, a
hiragana (the cursive kana characters in Japanese), a katakana, etc.
are also stored to this ROM 17 for storing longitude latitude
converting data in addition to the character code by a katakana in
which a pronunciation is shown by a character series.

In the case of this example, the latitude and longitude data every
place name are set to latitude and longitude data showing an absolute
position of the seat of a government office (a city office, a ward
office, a town office, a village office) in a region shown by its
place name. The character codes for display and data of a display
scale are outputted as accompanying data together with the latitude
and longitude data. These data of the display scale are set to data of
the display scale set in accordance with the size of a region shown by
its place name. For example, these data of the display scale are set
to data designating the display scale at several stages.

The longitude and latitude data and their accompanying data read from
the ROM 17 for storing longitude and latitude converting data are
supplied to an output terminal 10a as an output of the speech
recognition apparatus 10. Data of the character code of an input
speech detected as conformity by the speech recognition circuit 14 are
also supplied to the output terminal 10b as an output of the speech
recognition apparatus 10. The obtained data of these output terminals
10a and 10b are supplied to the navigation apparatus 20.

A talk switch 18 as an unlocked open-close switch (namely, a switch
attaining a turning-on state only when the switch is pushed) is
connected to the speech recognition apparatus 10 in this example. When
this talk switch 18 is continuously pushed for at least 300 msec or
more, the above processing is performed with respect to an audio
signal collected by the microphone 11 by circuits from the
analog/digital converter 12 to a longitude latitude converting circuit
16. This processing within the speech recognition apparatus 10 is
performed on the basis of control of the voice recognition circuit 14
and the speech recognition circuit 14 also judges a state of the talk
switch 18.

When it is judged that the talk switch 18 is again pushed for a
predetermined period by the voice recognition circuit 14 while the
above-mentioned processing is executed by the circuits from the
analog/digital converter 12 to the longitude latitude converting
circuit 16, the processing executed at present within the speech
recognition apparatus 10 is interrupted and the speech recognition
processing is restarted with respect to an audio signal inputted for a
period in which the talk switch 18 is newly pushed.

Further, when the recognition processing of speech again inputted is
performed within a predetermined time (e.g., within 10 seconds) in the
speech recognition circuit 14, speech previously recognized at this
time is deleted from the temporary list of recognized speech stored to
a memory within the speech recognition circuit 14, and it is judged
whether speech stored in a highest order of the deleted temporary list
of recognized speech is recognized. When such processing is
continuously performed plural times (e.g., 5 times), data of the
recognized speech as a candidate are read from the candidate list and
are supplied to the navigation apparatus 20. The candidate list is
displayed in the displaying apparatus 40 connected to the navigation
apparatus 20. Detailed contents of these processings will be described
later.

Various kinds of control data except for the above-mentioned character
codes can be also transmitted to the navigation apparatus 20 through
the terminal 10b from the speech recognition circuit 14 within the
speech recognition apparatus 10 in this example. For example, there is
a case in which control data for interrupting speech output processing
and processing for making map data are transmitted to the navigation
apparatus 20.

The construction of the navigation apparatus 20 connected to the
speech recognition apparatus 10 will next be explained. This
navigation apparatus 20 has an antenna 21 for a GPS. A signal for a
position measurement from a satellite for the GPS received by this
antenna 21 is received and processed by a present position detecting
circuit 22. The present position of the navigation apparatus is
detected by analyzing these received data. Data of the detected
present position are latitude and longitude data in an absolute
position at this time.

The data of the detected present position are supplied to an
arithmetic circuit 23. This arithmetic circuit 23 functions as a
system controller for controlling the operation of the navigation
apparatus 20. The arithmetic circuit 23 is connected to a CD-ROM
driver 24, a RAM 25, a car speed sensor 26 and an operation key 27. In
the CD-ROM driver 24, a CD-ROM (an optical disk) storing road map data
thereto is set and the CD-ROM driver 24 reads stored data of this
CD-ROM. The RAM 25 stores various kinds of data required for data
processing. The car speed sensor 26 detects the movement of the
vehicle where this navigation apparatus is mounted thereon. When
longitude and latitude coordinate data in the present position, etc.
are obtained, the arithmetic circuit 23 controls a reading operation
for reading the road map data in the vicinity of its coordinate
position to the CD-ROM driver 24. The arithmetic circuit 23 then makes
the RAM 25 temporarily store the road map data read by the CD-ROM
driver 24 and makes display data for displaying a road map by using
these stored road map data. At this time, these display data are set
to display data for displaying the map by a display scale (a reduced
scale) set by an operation of the operation key 27 arranged in a
predetermined position within the car, etc.

The display data made by the arithmetic circuit 23 are then supplied
to a video signal generating circuit 28. A video signal of a
predetermined format is generated by this video signal generating
circuit 28 on the basis of the display data. This video signal is
supplied to an output terminal 20c.

The video signal outputted from this output terminal 20c is then
supplied to a displaying apparatus 40 and image receiving processing
based on the video signal is performed by this displaying apparatus
40. Thus, the road map, etc. are displayed on a display panel of the
displaying apparatus 40.

In addition to the display of such a road map in the vicinity of the
present position, a road map, etc. in a position designated by the
operation of the operation key 27, etc. can be set to be displayed on
the basis of the control of the arithmetic circuit 23. Further, a
specific coordinate position such as "destination", "starting spot",
"routing spot", "one's own house", etc. can be set to be registered on
the basis of the operation of the operation key 27, etc. When this
specific coordinate position is registered, data (longitude and
latitude data) in this registered coordinate position are stored to
the RAM 25.

When the car speed sensor 26 detects running of the car, the
arithmetic circuit 23 is set such that no operation except for a
relatively simple operation within the operation of the operation key
27 is received.
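
The restriction on key operations while the car is running amounts to
gating each operation on the output of the car speed sensor 26. The
following is a minimal sketch of that gating. The operation names and
the split into "simple" and other operations are hypothetical, since
the description does not enumerate them, and accepting every operation
while the car is stopped is an assumption.

```python
# Hypothetical operation names: the description only distinguishes
# "relatively simple" operations from all other key operations.
SIMPLE_OPERATIONS = {"zoom_in", "zoom_out"}


def accept_key_operation(operation: str, car_is_running: bool) -> bool:
    """Return True if the arithmetic circuit should accept the key operation."""
    if not car_is_running:
        # Assumption: while the car is stopped, every key operation is accepted.
        return True
    # While the car is running, only relatively simple operations pass.
    return operation in SIMPLE_OPERATIONS
```

In this sketch a complicated operation such as setting a destination by
key would be rejected while running, which is the gap the speech input
is meant to fill.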
This navigation apparatus 20 also has a self-contained navigation
section 29. The navigation apparatus 20 calculates an exact running
speed of the car on the basis of a pulse signal corresponding to a car
speed and supplied to the computer for engine control, etc. on a car
side. The navigation apparatus 20 also detects an advancing direction
of the car on the basis of an output of a gyro sensor arranged within
the self-contained navigation section 29. The navigation apparatus 20
then measures the present position of the car by the self-contained
navigation from a position determined on the basis of the car speed
and the advancing direction. For example, when the present position
detecting circuit 22 attains a state unable to detect the car
position, the car position is measured by the self-contained
navigation from a car position finally detected by the present
position detecting circuit 22.

A speech synthesizer circuit 31 is also connected to the arithmetic
circuit 23. When any designation using speech is required in the
arithmetic circuit 23, the speech synthesizer circuit 31 executes
synthesizer processing of this designated speech and the speech is set
to be outputted from the speaker 32 connected to the speech
synthesizer circuit 31. For example, various kinds of designations
required for the navigation apparatus such as "Car approaches
destination", "Advancing direction is left", etc. are given through
voices. Further, in this speech synthesizer circuit 31, speech
recognized by the speech recognition apparatus 10 is set to be
synthesized on the basis of supplied character data and be outputted
as a voice from the speaker 32. This speech synthesizer processing
will be described later.

Here, this navigation apparatus 20 has input terminals 20a and 20b.
The longitude and latitude data, their accompanying data and the data
of a character code outputted from the output terminals 10a and 10b of
the voice recognition apparatus 10 are supplied to the input terminals
20a and 20b. The longitude and latitude data, their accompanying data
and the character code data obtained at these input terminals 20a and
20b are supplied to the arithmetic circuit 23.

When these longitude and latitude data, etc. are supplied from the
speech recognition apparatus 10, the arithmetic circuit 23 performs a
reading control operation for reading road map data in the vicinity of
the longitude and latitude from a disk by the CD-ROM driver 24. Then,
the arithmetic circuit 23 makes the CD-ROM driver 24 temporarily store
the read road map data to the RAM 25 and makes display data for
displaying a road map by using these stored road map data. At this
time, the display data are set to display data displayed with the
supplied longitude and latitude as a center and are also set to
display data for displaying the map by a scale (reduced scale)
designated by the display scale accompanied with the longitude and
latitude data.

A video signal is generated by the video signal generating circuit 28
on the basis of these display data. The displaying apparatus 40
displays a road map in a coordinate position designated from the
speech recognition apparatus 10.

When the character code of a word for designating the operation of the
navigation apparatus is supplied from the output terminal 10b of the
speech recognition apparatus 10 and is discriminated by the arithmetic
circuit 23, the arithmetic circuit 23 performs corresponding control.
In this case, when this character code is the character code of a word
for designating a display position such as "destination", "starting
spot", "routing spot", "one's own house", etc., it is judged whether a
coordinate in this display position is registered to the RAM 25 or
not. Thereafter, when this coordinate is registered to the RAM 25, the
CD-ROM driver 24 performs a reading control operation for reading road
map data in the vicinity of this display position from a disk.

When data of a character code showing the pronunciation of a
recognized speech are supplied from the speech recognition apparatus
10 to the arithmetic circuit 23, a word shown by this character code
is synthetically processed by the speech synthesizer circuit 31 and is
outputted as the speech from the speaker 32 connected to the speech
synthesizer circuit 31. For example, when "Bunkyo ward, Tokyo
Metropolis" is recognized as speech on a side of the speech
recognition apparatus 10, the speech synthesizer circuit 31 performs
synthetic processing for generating an audio signal for pronouncing
"Bunkyo ward, Tokyo Metropolis" on the basis of data of a character
series of this recognized pronunciation. This generated audio signal
is outputted from the speaker 32.

In this case, when the speech is recognized by the voice recognition
apparatus 10 in this example, longitude and latitude data are supplied
to the terminal 20a of the navigation apparatus 20 approximately
simultaneously when the data of a character code showing the
pronunciation of the recognized speech are supplied to the terminal
20b. The arithmetic circuit 23 first executes processing for
synthesizing a word recognized by the speech synthesizer circuit 31 as
speech, and next executes processing for making the display data of a
road map based on the longitude and latitude data.

An operation of the speech recognition apparatus, etc. will next be
explained when a road map display, etc. is performed by using the
speech recognition apparatus 10 and the navigation apparatus 20 in
this example. The flow chart of FIG. 6 shows a speech recognizing
operation performed by the speech recognition apparatus 10. In a step
101, it is first judged whether the talk switch 18 is turned on or
not. When it is judged that this talk switch 18 is turned on, an audio
signal collected by the microphone 11 for a predetermined period from
this turning-on operation is sampled by the analog/digital converter
12 and is processed by the digital voice processing circuit 13 and is
changed to vector data (step 102). Then, the speech recognition
circuit 14 performs speech recognizing processing based on these
vector data (step 103).

Here, it is judged in a step 104 whether the speech of a place name
(namely, a place name registered in advance) stored to the ROM 15 for
storing speech recognition data is recognized or not. When the speech
of the registered place name is recognized, character data for
pronouncing the recognized place name are read from the ROM 15 and are
outputted from the output terminal 10b (step 105). Further, longitude
and latitude data of the recognized place name are read from the ROM
17 for storing longitude and latitude converting data connected to the
longitude latitude converting circuit 16 (step 106). Here, in the
speech recognition of the place name, place names registered in the
ROM 15 in this example are constructed by the names of domestic urban
and rural prefectures, cities, wards, towns and villages. Accordingly,
for example, speech of "xx city, xx prefecture" and speech of "xx
ward, xx city" (here, the speech can be set to be recognized even when
the names of urban and rural prefectures are omitted in the ward case)
are recognized.

The longitude and latitude data read on the basis of the recognized
speech and accompanying data thereof are outputted from the output
terminal 10a (step 107).

When no speech of the registered place name can be recognized in the
step 104, it is judged in a step 108 whether
a specific registered speech except for the place name is
recognized or not. Here, when the specific registered speech
except for the place name is recognized, a character code
corresponding to the recognized speech is judged (step 109)
and is outputted from the output terminal 10b (step 110).

In contrast to this, when no specific registered speech
except for the place name can be recognized in the step 108,
processing at this time is terminated. Alternatively, a notice that
the speech could not be recognized is transmitted to the navigation
apparatus 20. The navigation apparatus 20 then gives a warning
by synthesized speech from the speech synthesizer circuit 31
or by characters, etc. displayed in the displaying apparatus 40.
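
The branching in steps 104 to 110 of FIG. 6 can be summarized in a
short sketch. The table contents, the coordinate values and the
function name below are illustrative assumptions, not data taken from
the ROM 15 or the ROM 17; only the order of the checks follows the
description above.

```python
from typing import Optional

# Hypothetical stand-ins for the ROM 15 (recognizable words) and the
# ROM 17 (latitude/longitude and display scale per place name).
PLACE_NAMES = {"bunkyo ward, tokyo metropolis"}
COMMAND_WORDS = {"destination", "where now", "list"}
COORDINATES = {
    # (latitude, longitude, display scale level); illustrative values only.
    "bunkyo ward, tokyo metropolis": (35.7, 139.75, 2),
}


def dispatch_recognition(recognized: Optional[str]) -> dict:
    """Route a recognition result to the output terminals 10a and 10b."""
    if recognized in PLACE_NAMES:
        # Steps 105 to 107: character code on 10b, coordinates on 10a.
        return {"10b": recognized, "10a": COORDINATES[recognized]}
    if recognized in COMMAND_WORDS:
        # Steps 109 and 110: only the character code is output, on 10b.
        return {"10b": recognized, "10a": None}
    # Neither check succeeded: nothing is output on either terminal.
    return {"10b": None, "10a": None}
```

A place name therefore yields data on both terminals, a command word
only on terminal 10b, and unrecognized speech on neither.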

Next, the flow chart of FIG. 7 shows the operation of the
navigation apparatus 20. It is first judged in the arithmetic
circuit 23 in a step 201 whether a display mode in the present
position is set or not. When it is judged that the display mode
in the present position is set, the present position detecting
circuit 22 measures the present position (step 202). Road
map data in the vicinity of the measured present position are
read from the CD-ROM (step 203). Display processing of a
road map based on these read road map data is performed
and the road map in a corresponding coordinate position is
displayed in the displaying apparatus 40 (step 204).

In contrast to this, when it is judged in the step 201 that
no display mode in the present position is set, or, when the
display processing of the road map in the present position in
the step 204 is terminated and a displaying state of this road
map is set, it is judged in a step 205 whether longitude and
latitude data, etc. are supplied from the speech recognition
apparatus 10 through the input terminals 20a and 20b. Here,
when it is judged that the longitude and latitude data and
accompanying character data thereof, etc. are supplied, a
character code for a pronunciation supplied through the
terminal 20b is first supplied to the speech synthesizer
circuit 31 and speech recognized by the speech recognition
apparatus 10 is synthesized and outputted from the speaker
32 (step 206). Subsequently, road map data in the vicinity of
a position shown by the longitude and latitude data are read
from the CD-ROM (step 207) and display processing of a
road map based on these read road map data is performed.
The road map in a corresponding coordinate position is then
displayed in the displaying apparatus 40 (step 208).

When it is judged in the step 205 that no longitude and
latitude data are supplied from the speech recognition
apparatus 10, or when display processing of the road map of a
designated place name in the step 208 is terminated and a
displaying state of this road map is set, it is judged in a step
209 whether or not a character code for directly designating
a display position is supplied from the speech recognition
apparatus 10 through the input terminal 20b. When it is
judged that the character code is supplied from the terminal
20b, this character code is supplied to the speech synthesizer
circuit 31 and speech recognized by the speech recognition
apparatus 10 is outputted from the speaker 32 (step 210).
Next, when the character code (namely, words of
"destination", "starting spot", "routing spot", "one's own
house", etc.) for directly designating the display position is
discriminated in the step 209, it is judged in a step 211
whether a coordinate position designated by these characters
is registered to the RAM 25 or not. When this coordinate
position is registered to the RAM 25, road map data in the
vicinity of a position shown by the longitude and latitude
data as the registered coordinate position are read from the
CD-ROM (step 212). Then, display processing of a road
map based on these read road map data is performed and a
road map in the corresponding coordinate position is
displayed in the displaying apparatus 40 (step 213) and it is
returned to the step 201 in this displaying state.

When it is judged in the step 209 that no character code
for directly designating the display position is supplied from
the speech recognition apparatus 10, it is judged in the
arithmetic circuit 23 in a step 214 whether or not there is an
operation for designating the display position by operating
the operation key 27. When there is an operation for
designating this display position, it is judged in a step 215
from detected data of the car speed sensor 26 whether the
vehicle is running at the present time or not. When the
arithmetic circuit 23 judges that the vehicle is running, the
operation at this time is invalidated and it is returned to the
step 201 (a certain warning may be given at this time).

When it is judged that the vehicle is not running, control
proceeds to the step 211. In the step 211, it is judged whether
there is a registered coordinate or not. Thereafter, when there
is a registered coordinate position, display processing of a
road map in this position is performed in the steps 212, 213
and it is then returned to the step 201.

In contrast to this, when no coordinate in a corresponding
position such as "destination", "starting spot", "routing
spot", "one's own house", etc. is registered in the step 211,
a warning that the position is unregistered is given by a
synthesized speech in the speech synthesizer circuit 31 or by
display characters in the displaying apparatus 40 in a step
216 and it is then returned to the step 201.
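
The handling of a directly designated display position (steps 209 to
216) may be summarized by the following sketch. The function name
designate_position, its parameters and the returned labels are
hypothetical, and the dictionary merely stands in for the coordinates
registered to the RAM 25; only the order of the judgments follows the
text above.

```python
from typing import Dict, Optional, Tuple

Coordinate = Tuple[float, float]


def designate_position(
    keyword: str,
    registered: Dict[str, Coordinate],
    *,
    by_key: bool,
    vehicle_running: bool,
) -> Tuple[str, Optional[Coordinate]]:
    """Decide what to do with a request such as "destination"."""
    # Step 215: a key operation is invalidated while the vehicle is running.
    # A designation by speech (step 209) is not subject to this check.
    if by_key and vehicle_running:
        return ("ignored", None)
    # Step 211: is a coordinate registered for this keyword?
    coord = registered.get(keyword)
    if coord is None:
        return ("warn_unregistered", None)   # step 216
    return ("show_map", coord)               # steps 212-213


ram = {"destination": (35.0, 135.0)}  # placeholder coordinate
print(designate_position("destination", ram, by_key=False, vehicle_running=True))
```

The sketch makes the asymmetry explicit: the same request is refused
when made with the operation key 27 during driving but accepted when
made by speech.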

Processing relative to the map display is explained with
reference to the flow chart of FIG. 7. However, when a
character code is supplied from the speech recognition
apparatus 10 as a result of the recognition of speech for
designating an operation other than the map display,
corresponding processing is performed on the basis of control
of the arithmetic circuit 23. For example, when "what time
now", etc. is recognized and a character code is supplied,
speech for pronouncing the present time is synthesized by
the speech synthesizer circuit 31 on the basis of the control
of the arithmetic circuit 23 and is outputted from the speaker
32. The other commands are also processed such that a
responsive speech is synthesized by the speech synthesizer
circuit 31 and is outputted from the speaker 32, or a
corresponding display is performed by the displaying
apparatus 40.

Since the above processing is performed, the display 
position can be freely set by speech input in any place in the 
whole country and a road map in a desirable position can be 
simply displayed. Namely, the speech of an operator is 
simply recognized when the operator speaks "xx city, xx 
prefecture" and "xx ward, xx city" toward the microphone 
11 while the operator pushes the talk switch 18. A road map 
in this region is also displayed. Accordingly, it is not 
necessary to designate a position by a key operation, etc. For 
example, the navigation apparatus can be operated even in 
a situation in which it is difficult to perform the key 
operation. In this case, the spoken place name recognized by 
the speech recognition apparatus 10 in this example is 
limited to the names of domestic urban and rural prefectures, 
cities, wards, towns and villages so that the number of 
recognized speech is limited to a relatively small number 
(about 3500). Accordingly, the place name can be recog- 
nized by speech recognition processing for a short time by 
a relatively small processing amount by the speech recog- 
nition circuit 14 within the speech recognition apparatus 10. 
Therefore, it is possible to shorten a time until a map 
designated in an inputted voice is displayed. Further, a 
recognition rate itself is also improved since the number of 
recognized place names is limited.

Here, the flow chart of FIG. 8A shows a processing
operation relative to a map display based on the speech
recognition among the operations of the speech recognition
apparatus 10 and the navigation apparatus 20 explained
above. This flow chart mainly shows processing when
speech is repeated.

First, it is judged in step 301 whether a talk has started,
e.g., the talk switch 18 is turned on or not. When it is judged
that talk has started, speech recognition processing of an
inputted audio signal is started in a step 302. Then, it is
judged in a step 303 whether the talk has terminated, i.e., the
talk switch 18 is turned off or not. When it is judged that talk
has terminated, the speech recognition processing is
performed with respect to all audio signals inputted at that
time point in a step 304. Then, it is judged in a step 305
whether the recognition processing is terminated or not.

Here, when it is judged that no recognition processing has
terminated (i.e., while the recognition processing is
continuously performed), control goes from a step 306 and it
is judged whether the talk has started (namely, whether the
talk switch 18 is pushed) or not. When it is not judged that the
talk has started, it is returned to the step 304 and the
recognition processing is continuously performed. In
contrast to this, when it is judged in the step 306 that the talk
is started, the recognition processing till now has stopped in
a step 307. Then, it is returned to the step 302 and the
recognition processing is started by an audio signal newly
inputted.

When it is judged in the step 305 that the recognition
processing has terminated, control goes from a step 308 and
output processing of the recognized speech from the speaker
32 by speech synthesis in the speech synthesizer circuit 31
is performed. Here, while this speech is outputted from the
speaker 32, it is judged in a step 309 whether the talk is
started (namely, whether the talk switch 18 is pushed) or not.
At this time, when it is judged that the talk is started, control
data for interrupting the speech synthesis processing in the
speech synthesizer circuit 31 are transmitted from the speech
recognition circuit 14 within the speech recognition
apparatus 10 to the arithmetic circuit 23 within the navigation
apparatus 20 so that the speech synthesizer processing in the
speech synthesizer circuit 31 is stopped. Then, it is returned
to the step 302 and the recognition processing is started by
an audio signal newly inputted.

When it is not judged in the step 309 that the talk has
started, it is judged in a step 311 that the output of the
recognized speech by the speech synthesis in the speech
synthesizer circuit 31 is terminated. Thereafter, a map
display in a corresponding position is executed in the
displaying apparatus 40 in a step 312 and processing of the
map display by the speech recognition is terminated.

Since the above display processing is performed, the
display position can be freely set by speech input in any
place in the whole country and a road map in a desirable
position can be simply displayed. Namely, the speech of an
operator is simply recognized when the operator speaks "xx
city, xx prefecture" and "xx ward, xx city" toward the
microphone 11 while the operator pushes the talk switch 18.
A road map in this region is also displayed. Accordingly, it
is not necessary to designate a position by a key operation,
etc. For example, the navigation apparatus can be operated
even in a situation in which it is difficult to perform the key
operation. In this case, the spoken place name recognized by
the speech recognition apparatus 10 in this example is
limited to the names of domestic urban and rural prefectures,
cities, wards, towns and villages so that the number of
recognized speech is limited to a relatively small number
(about 3500). Accordingly, the place name can be
recognized by the speech recognition processing in a short
time by a relatively small processing amount by the speech
recognition circuit 14 within the speech recognition apparatus
10. Therefore, it is possible to shorten a time until a map
designated in an inputted speech is displayed. Further, a
recognition rate itself is also improved since the number of
recognized place names is limited.

In the case of this example, a series of characters of
recognized speech is outputted as speech by the speech
synthesis in the speech synthesizer circuit 31 of the
navigation apparatus 20. Accordingly, the operator can judge
by hearing the output speech whether the inputted speech is
correctly recognized or not. The operator can also
immediately judge without actually seeing a displayed map
whether the displayed map is a map in a correct region or not.
Accordingly, it is possible to prevent an error in operation
caused by an erroneous speech input, namely, an error in
operation in which a map in a region different from a place
designated in the speech is displayed.

In the case of this example, while speech is processed
within the speech recognition apparatus 10 after a region,
etc. are designated by this speech, there is a case in which
the talk switch 18 is again pushed to designate a region, etc.
by a new speech. In this case, the processing within the
speech recognition apparatus 10 is stopped and the
recognition processing is executed on the newly inputted
speech. Accordingly, it is a convenient process when there is
an input error, etc. For example, when speech is inputted and
there is an error in the names of urban and rural prefectures,
cities, wards, towns and villages, this error can be corrected
by again pushing the talk switch 18 and repeating the
speech. Accordingly, the erroneous input can be very simply
corrected in comparison with an operating case in which the
key 27 for designating various kinds of operations is
operated to designate the input error. In particular, the speech
input error can be simply corrected even in a situation in
which it is difficult to perform a complicated key operation
such as a situation during driving of a car. Therefore, it is
suitable for a navigation apparatus as in this example.

In this case, as shown in the flow chart of FIG. 8A,
reprocessing using a reinput of this speech is valid while the
recognized speech is outputted. Accordingly, for example,
when speech outputted from the speaker 32 is different from
the voice of a driver intending to talk (namely, when an error
in recognition is caused in the voice recognition circuit 14),
the recognition processing can be executed again by
repeating speech of the same region name. Therefore, it is
possible to cope with a recognized error by a simple
operation.

In the flow chart of FIG. 8A, a reinput is received while
the recognized speech is outputted from the speaker 32 by
speech synthesis in the speech synthesizer circuit 31.
However, while map data are read and a road map is
displayed in the displaying apparatus 40 after the output of
this speech, the map display may be interrupted when the
talk is started and the recognition processing may be started
for the newly inputted speech.

In the case of this example, data of a coordinate position
corresponding to a place name stored to the ROM 17 within
the speech recognition apparatus 10 are set to latitude and
longitude data showing an absolute position in the seat of a
government office (a city office, a ward office, a town office,
a village office) in its region. Accordingly, a map with the
government office as a center in its region is displayed so
that a preferable display state is set. Namely, the government
office in each region is located in a central portion of this
region relatively in many cases. Accordingly, the possibility
of a most preferable display form is high.

In this case, the scale (reduced scale) of a display map is
set to a display scale shown by accompanying data stored in
the ROM 17. Accordingly, for example, it is possible to
provide a display form for approximately displaying all of a
region designated in a speech at this time so that a preferable
display can be provided. This display scale may be set to a
predetermined scale fixed at any time. For example, varying
and fixing settings of this display scale may be switched by
setting a mode.

In the case of this example, a speech ("destination",
"starting spot", "routing spot", "one's own house", etc.) for
specifying a place except for a place name can be also
recognized by the speech recognition apparatus 10.
Accordingly, a display position can be directly set to a
registered position by performing this designation through
the speech. In this case, it is not necessary to judge
coordinate data within the speech recognition apparatus 10 so
that processing of the speech recognition apparatus 10 can be
correspondingly performed rapidly.

In the case of this example, when the names of cities,
wards, towns and villages are recognized as voices by the
voice recognition apparatus 10, it is recognized as the same
place name in both the pronouncing cases of "machi" and
"son" and the pronouncing cases of "cho" and "mura" with
respect to "town" and "village". Accordingly, the place name
itself can be correctly recognized even when the
pronunciations of "town" and "village" are incorrect, thereby
improving the recognition rate correspondingly. Further, the
names of cities, wards, towns and villages tending to be
mistaken with respect to the names of urban and rural
prefectures can be also recognized correctly even when these
names of the urban and rural prefectures are mistaken,
thereby further improving the recognition rate.

Further, in this example, when a speech is inputted and its
recognition processing is performed and the speech is then
inputted again, past recognized results are referred to at a
recognition processing time of this reinputted speech.
Hereafter, this processing is shown in the flow chart of FIG.
8B.

First, when a sufficient time (e.g., several minutes) has
passed from a previous speech recognition processing, a
temporary list of recognized speech within the speech
recognition circuit 14 is cleared in a step 301. Thereafter, it is
judged in a step 302 whether talk has started, i.e., the talk
switch 18 is turned on or not. When it is judged that the talk
is started, it is further judged in a step 303 whether a
predetermined time Th (here, 10 seconds) has passed from
a previous talk or not. When the predetermined time has
passed, the temporary list of recognized speech within the
speech recognition circuit 14 is cleared in a step 304. In
contrast to this, when no predetermined time Th has passed
from the previous talk, no temporary list of recognized
speech is cleared.

The recognition processing of an inputted speech is next
performed on the basis of control of the voice recognition
circuit 14 in a step 305. Speech data of a candidate obtained
by this recognized result is collated with speech data in the
temporary list of recognized speech. When there are the
same data in the temporary list of recognized speech, these
data are deleted from the recognized candidate in a step 306.
Data having a highest recognition degree (conformity
degree) among the remaining candidate data at this time are
supplied to the speech synthesizer circuit 31 of the
navigation apparatus 20 as a recognized result and are
outputted as speech from the speaker 32 in a step 307. When
this recognized result is a speech indicating a region (i.e., the
names of urban and rural prefectures, cities, wards, towns
and villages in the case of this example), a map for
displaying these cities, wards, towns and villages is displayed
in the displaying apparatus 40 by processing within the
navigation apparatus 20 in a step 308. At this time, the
recognized result is added to the temporary list of recognized
speech in a step 309 and it is returned to the step 302 and
waits for the next talk to start.

In such control, when the talk is continuously carried out
within a constant time (e.g., within 10 seconds), it is
considered as a repeated talk so that a first candidate of the
previous recognized result is removed from recognized
object words. Accordingly, it is possible to prevent a
situation in which the incorrect place name is recognized
again when the talk is repeated and the desired place name is
never recognized. For example, "Kanagawa ward, Yokohama
city" and "Kanazawa ward, Yokohama city" exist as similar
place names. When a person inputting speech talks "Kanagawa
ward, Yokohama city", it is assumed that "Kanazawa ward,
Yokohama city" is incorrectly recognized. At this time,
when no countermeasures are taken by repeating the same
pronunciation, there is a high possibility that "Kanazawa
ward, Yokohama city" is again recognized incorrectly.
However, here, there is already the pronunciation of
"Kanazawa ward, Yokohama city" in the temporary list of
recognized speech at a second speech inputting time.
Accordingly, this "Kanazawa ward, Yokohama city" is
removed from the recognized object words. When there is
"Kanagawa ward, Yokohama city" as a second candidate,
this "Kanagawa ward, Yokohama city" is moved up as the
first candidate. Thus, it is judged that "Kanagawa ward,
Yokohama city" is recognized. As a result, when the
respeaking is carried out, the error in recognition is prevented
so that recognition rate can be correspondingly improved.

Here, the removal is limited to the case of the repeated
talk within a constant time of about 10 seconds. Accordingly,
a word once removed from the recognized object words can
be recognized again unless the talk is immediately repeated.
Therefore, the recognition rate is not reduced in this respect.

In this example, when such a repeated talk is repeated, it
is judged that it is difficult to exactly perform the recognition
by only the speech input. Then, data of the candidate list at
this time are supplied to the navigation apparatus 20. A video
signal for displaying candidates having a recognizing
possibility within the navigation apparatus 20 as a list is made
and this list is displayed in the displaying apparatus 40.

The flow chart of FIG. 9 shows processing in this case.
First, when a sufficient time (e.g., several minutes) has
passed from previous voice recognition processing, the
temporary list of recognized speech within the speech
recognition circuit 14 is cleared in a step 401. Thereafter, it is
judged in a step 402 whether a talk has started, namely, the
talk switch 18 is turned on or not. When it is judged that the
talk has started, it is further judged in a step 403 whether a
predetermined time Th (here, 10 seconds) has passed from
a previous talk or not. When the predetermined time has
passed, the temporary list of recognized speech within the
speech recognition circuit 14 is cleared in a step 404. In
contrast to this, when no predetermined time Th has passed
from the previous talk, no temporary list of recognized
speech is cleared.

The recognition processing of an inputted speech is next
performed on the basis of control of the speech recognition
circuit 14 in a step 405. Speech data of a candidate obtained
by this recognized result is collated with speech data in the
temporary list of recognized speech. When there are the
same data in the temporary list of recognized speech, these
data are deleted from the recognized candidate in a step 406.
Next, it is judged in a step 407 whether or not the number
of items of the temporary list of recognized speech is equal
to or greater than N (here, 5). When the number of items is
not equal to or greater than N (i.e., when no talk is
continuously carried out N times), control goes from a step
408. Then, data having a highest recognition degree
(conformity degree) among the remaining candidate data at
this time are supplied to the speech synthesizer circuit 31 of
the navigation apparatus 20 as a recognized result and are
outputted as speech from the speaker 32. When the recognized
result is speech showing a region (namely, the names of urban
and rural prefectures, cities, wards, towns and villages in the
case of this example), a map for displaying these cities,
wards, towns and villages is displayed in the displaying
apparatus 40 by processing within the navigation apparatus
20 in a step 409. Then, the recognized result at this time is
added to the temporary list of recognized speech in a step
410 and it is returned to the step 402 and waits for the next
talk to start.

When it is judged in the step 407 that the number of items
of the temporary list of recognized speech is N (i.e., when
the talk is continuously carried out N times), control goes
from a step 411 and display processing of the candidate list
is performed. Namely, candidate data recognized in the
recognition processing up to now are read from a memory
for the candidate list within the speech recognition circuit 14
and are supplied to the navigation apparatus 20. Then, a
video signal of the candidate list is generated by the video
signal generating circuit 28 within the navigation apparatus
20. This video signal is supplied to the displaying apparatus
40 so that the candidate list is displayed in the displaying
apparatus 40.

For example, the candidate list at this time is displayed as
shown in FIG. 10. Namely, candidates such as about first to
fifth candidates are displayed in the order of a highest
conformity degree (lower candidates may be also displayed
by a scroll operation, etc.). At this time, candidates for place
names and commands are set to be displayed in different
forms (e.g., display colors of characters are changed). In the
example of FIG. 10, these candidates are displayed in
different character forms.

A mark "a" showing a selection is given to the first
candidate among the candidates within this candidate list at
a first display stage of this candidate list. This mark "a"
showing the selected candidate can be moved by the scroll
operation performed by the operation key 27. Next, it is
judged in a step 412 whether this scroll operation is
performed or not. Here, when the scroll operation is
performed, a position of the mark "a" given to the selected
candidate is moved in a step 413.

In this state, it is judged in a step 414 whether a button for
determination within the operation key 27 is pushed or not.
When it is judged that this button for determination is
pushed, it is judged that the candidate shown by the mark "a"
is selected. Then, reading of data (longitude and latitude data,
character data for a speech output, etc.) relative to this
candidate is commanded by the speech recognition
apparatus 10. These read data are supplied to the navigation
apparatus 20. The speech synthesis processing is performed
on the basis of these supplied data in the speech synthesizer
circuit 31 so that a place name is outputted as speech from
the speaker 32 in a step 415. Then, a video signal for
displaying a road map in a corresponding position is made
on the basis of the supplied longitude and latitude data and
the map of the selected candidate is displayed in the
displaying apparatus 40 in a step 416. The selected result at
this time is added to the temporary list of recognized speech
in a step 417 and it is returned to the step 402 and waits for
the next talk to start.

When it is judged in the step 414 that no button for
determination is pushed, it is further judged in a step 418
whether talk has started after that, namely, the talk switch 18
is turned on or not. When it is judged that talk has started,
the display of the candidate list is stopped and it is returned
to the processing in the step 403. In contrast to this, when it
is judged in the step 418 that no talk has started, it is further
judged in a step 419 whether or not a predetermined time Td
(e.g., about 10 seconds) has passed from starting of the
display of the candidate list in the step 411. When this time
Td has not passed, it is returned to the processing in the step
412 and a display state of the candidate list is continued. In
contrast to this, when it is judged in the step 419 that the
predetermined time Td has passed, it is further judged in a
step 420 whether the scroll operation is performed in the step
412 or not. When the scroll operation is performed, it is
returned to the processing in the step 412 and the display
state of the candidate list is continued.

In contrast to this, when it is judged in the step 420 that
no scroll operation is performed, control goes from the step
408 and a first order result of the candidate list is outputted
as speech and a map of this first order place name is
displayed.

Thus, when the speech input is repeated predetermined
times (here, 5 times) for a short time, recognized object
words recognized by a continuous input audio signal at this
time are displayed as a list in a high recognition degree
order. Accordingly, a recognizing state at this time can be
easily judged and a word can be selected from the displayed
list. Therefore, it is possible to easily cope with a difficult
case of the recognition using the speech input by means of
a simple operation.

In the explanation of the flow charts of FIGS. 8B and 9,
the selected candidate is a place name and a map is displayed
on the basis of this place name. However, when the selected
candidate is certain commands, corresponding commands
are executed instead of the map display.

In the above processing in the step 306 shown in FIG. 8B
and the step 406 shown in FIG. 9, contents are simply
deleted from the candidate list from the recognized result.
However, in the case of certain commands, its word may not
be removed from the candidate list. Namely, processing in
steps 501 and 502 shown in FIG. 11 may be performed
instead of steps 306 and 406. Namely, it is judged in a step
501 whether the recognized result is a place name or not. In
the step 502, contents in the hysteresis list are deleted from
candidates of the recognized result only when it is judged as
a place name in the step 501. When the recognized result is
commands, etc. except for the place name, control goes from
the next step without the deletion in the step 502. Thus, in
the case of speech for giving any commands, this speech is
repeatedly recognized every time this speech is inputted,
thereby executing a corresponding operation. For example,
"what time now" is recognized in a speech and this result is
outputted as "It is xx o'clock, xx minute" in a speech. In this
case, when the speech about this time is missed in hearing,
there is a case in which "what time now" is continuously
recognized again in a speech. In this case, this result is also
outputted again as "It is xx o'clock, xx minute" in a speech
so that corresponding control is preferably performed.
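
The repeated-talk handling of FIGS. 8B and 11 can be sketched as
follows. This is an illustration under stated assumptions, not the
disclosed circuit: the class name RepeatFilter, its method select and
the conformity scores are hypothetical, and exempting command words
candidate by candidate is a simplification of the judgment made on
the recognized result in steps 501 and 502. Only the value of Th
(10 seconds) and the place names come from the text.

```python
from typing import List, Optional, Set, Tuple

TH_SECONDS = 10.0  # predetermined time Th given in the text


class RepeatFilter:
    """Keeps the temporary list of recognized speech between talks."""

    def __init__(self, commands: Set[str]) -> None:
        self.commands = commands            # words never removed (FIG. 11)
        self.history: List[str] = []        # temporary list of recognized speech
        self.last_talk: Optional[float] = None

    def select(self, candidates: List[Tuple[str, float]], now: float) -> Optional[str]:
        """Pick the best remaining candidate for a talk starting at `now`."""
        # Steps 303-304: clear the list when Th has passed since the last talk.
        if self.last_talk is None or now - self.last_talk >= TH_SECONDS:
            self.history.clear()
        self.last_talk = now
        # Steps 501-502 (simplified): drop words already output, keep commands.
        remaining = [
            (word, score)
            for word, score in candidates
            if word in self.commands or word not in self.history
        ]
        if not remaining:
            return None
        best = max(remaining, key=lambda c: c[1])[0]  # highest conformity degree
        self.history.append(best)                     # step 309
        return best


f = RepeatFilter({"what time now"})
cands = [("Kanazawa ward, Yokohama city", 0.9),   # made-up scores
         ("Kanagawa ward, Yokohama city", 0.8)]
print(f.select(cands, now=0.0))   # first talk
print(f.select(cands, now=5.0))   # repeated within Th
```

In the usage above the second call is treated as a repeated talk, so
the word output the first time is excluded and the second candidate
moves up, which is the behavior described for the Kanagawa and
Kanazawa example.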
When the candidate list is displayed as a table as shown in FIG. 10, recognized object words shown in this list may be sequentially outputted as speech from the speaker 32 by the speech synthesis processing in the speech synthesizer circuit 31. Thus, candidates for the recognized object words are known without seeing the display of the displaying apparatus 40 so that handling of the navigation apparatus is improved.

In the above embodiment, place names recognized by the speech recognition apparatus are limited to the names of domestic urban and rural prefectures, cities, wards, towns and villages. However, more detailed place names, the names of target objects, etc. may be also recognized. In this case, when the number of recognizable place names, etc. is increased, a processing amount and a processing time required for the speech recognition are correspondingly increased. Therefore, it is most preferable to limit the number of recognizable place names to about the number of names of cities, wards, towns and villages so as to improve a recognition rate.

In the above embodiment, a central coordinate every place name is set to latitude and longitude data showing an absolute position in the seat of a government office (a city office, a ward office, a town office, a village office) in its region, but may be set to latitude and longitude data showing another position. For example, the central coordinate may be simply set to latitude and longitude data of a center of its region (a city, a ward, a town, a village).

Further, data in the coordinate positions of end portions of east, west, south and north in its region may be stored instead of such central latitude and longitude data. In this case, it is sufficient if there are four data of east and west longitudes and south and north latitudes.

In the above embodiment, a recognized speech is converted to a character code by the speech recognition apparatus 14 within the speech recognition circuit, and this character code is converted to longitude and latitude data by the longitude latitude converting circuit 16. However, the recognized speech may be directly converted to longitude and latitude data. When no recognized speech is directly converted to the longitude and latitude data, the ROM 15 and the ROM 17 for storing these converted data may be constructed by the same memory such that, for example, the memory area of a place name is commonly used.

In the above embodiment, the present invention is applied to a navigation apparatus using a position measuring system called a GPS. However, the present invention can be also applied to a navigation apparatus using another position measuring system.

In accordance with the speech recognition apparatus of the present invention, when a new audio signal is inputted during the execution of processing of an inputted audio signal, the executed processing is interrupted and the processing of the newly inputted audio signal is executed. Accordingly, for example, when a specific region is designated in a speech and a place name, etc. are incorrectly inputted, processing subsequent to the recognition processing by a correct speech are executed only by respeaking a correct place name. Therefore, it is possible to simply cope with an input error, etc. without performing a complicated key operation for canceling the input, etc.

Further, in this speech recognition apparatus, when another speech is again inputted during the output of a recognized speech, output processing of the speech is interrupted and discriminating processing of the reinputted speech is performed. Accordingly, it is possible to simply cope with a case in which an error in recognition is known by the output of the recognized speech, by means of only the speech reinput.

When the recognized voice is not correctly recognized when first spoken, but is respoken, a recognized object word first recognized in error is removed from the candidates for recognized object words recognizable in a voice processing section. Accordingly, correct recognition possibility is increased so that recognition rate can be substantially improved.

In this case, when recognized object words showing a specific region and a predetermined command are prepared as a recognized speech and the recognized object word of a previously recognized speech is a speech showing the predetermined command, this recognized object word is not removed from the recognized object words able to be recognized as speech. Accordingly, the removing processing from candidates is performed only in the case of a speech showing a region. Therefore, it is possible to prevent an error in operation when the same command is repeatedly inputted as speech.

When the speech is repeatedly inputted a predetermined number of times, recognized object words recognized by a continuous input audio signal at this time are displayed as a list in a high recognition degree order and are selected from this display. Accordingly, it is possible to simply cope with the case of a repetitious error in recognition.

In accordance with the speech recognition method of the present invention, when a new audio signal is inputted during the execution of processing of an inputted audio signal, the executed processing is interrupted and the processing of the newly inputted audio signal is executed. Accordingly, for example, when a specific region is designated in speech and a place name, etc. are incorrectly inputted, processing subsequent to the recognition processing by a correct speech are executed only by respeaking a correct place name. Therefore, it is possible to simply cope with an input error time, etc. without performing a complicated key operation for canceling the input, etc.

In this speech recognition method, when another speech is again inputted during the output of a recognized speech, output processing of the speech is interrupted and discriminating processing of the reinputted speech is performed. Accordingly, it is possible to simply cope with a case in which an error in recognition is known by the output of the recognized speech, by means of only the speech reinput.

Further, in accordance with the speech recognition method of the present invention, when the recognized speech is not correctly recognized when first spoken, but is respoken, a recognized object word first recognized in error is removed from the candidates for recognized object words able to be recognized. Accordingly, correct recognition possibility is increased.

In accordance with the navigation apparatus of the present invention, when a new audio signal is inputted during the execution of processing for a map display by an inputted audio signal, the executed processing is interrupted and the map display processing is executed by the newly inputted audio signal. Accordingly, for example, when a specific region is designated in a speech and a place name, etc. are incorrectly inputted, a region name by a correct speech is recognized only by respeaking a correct place name, and a map in a correct position is displayed. Therefore, it is possible to simply cope with a case in which a speech is inputted in error, by means of speech.

In this navigation apparatus, when another speech is again inputted during the output of a recognized speech, output
processing of the speech is interrupted and discriminating processing of the reinputted speech is performed. Accordingly, it is possible to simply cope with a case in which an error in recognition is known by the output of the recognized speech, by means of only the speech reinput.

Further, in accordance with the navigation apparatus of the present invention, when the recognized speech is not correctly recognized when first spoken, but is respoken, a recognized object word first recognized in error is removed from the candidates for recognized object words able to be recognized. Accordingly, the possibility of an operation of the navigation apparatus such as a correct map display, etc. by correct recognition is increased.

In this case, when recognized object words showing a specific region and a predetermined command are prepared as a recognized speech and the recognized object word of a previously recognized speech is a speech showing the predetermined command, this recognized object word is not removed from the recognized object words able to be recognized as speech. Accordingly, the removing processing from candidates is performed only in the case of a speech showing a region. Therefore, it is possible to prevent an error in operation of the navigation apparatus when the same command is repeatedly inputted as speech.

When the speech is repeatedly inputted a predetermined number of times, recognized object words recognized by a continuous input audio signal at this time are displayed as a list in a high recognition degree order so that a recognizing state at that time can be easily judged.

The recognized object words displayed as a list are sequentially outputted as speech from the above speech output section. Accordingly, the recognized object words are known in speech without seeing a displaying state.

Further, a selection can be made from the recognized object words displayed as a list on the basis of the operation of a predetermined operating means. Accordingly, when there is a repetitious error in recognition, a required recognized object word can be simply searched and a map display by this word, etc. can be performed.

When no operation of the predetermined operating means is performed within a predetermined time in a state displayed as a list, the recognized object word of a candidate having a highest recognition possibility is automatically selected from the recognized object words displayed as a list. Accordingly, the list display is switched to the map display in a suitable form.

Further, when an audio signal is inputted to an audio signal input means during the list display, the speech recognition processing of this inputted audio signal is performed in a speech processing section. Accordingly, it is possible to cope with a case in which a speech is again inputted by confirming the list display. Therefore, an operation using the speech input can be preferably performed.

Further, in accordance with the navigation method of the present invention, when a new audio signal is inputted during the execution of processing for a map display by an inputted audio signal, the executed processing is interrupted and the map display processing is executed by the newly inputted audio signal. Accordingly, for example, when a specific region is designated in a speech and a place name, etc. are incorrectly inputted, a region name by a correct speech is recognized only by respeaking a correct place name, and a map in a correct position is displayed. Therefore, it is possible to simply cope with a case in which a speech is inputted in error, by means of speech.

Further, in this navigation method, when another speech is again inputted during the output of a recognized speech, output processing of the speech is interrupted and discriminating processing of the reinputted speech is performed. Accordingly, it is possible to simply cope with a case in which an error in recognition is known by the output of the recognized speech, by means of only the speech reinput.

Further, in accordance with the navigation method of the present invention, when the recognized speech is not correctly recognized when first spoken, but is respoken, a recognized object word first recognized in error is removed from the candidates for recognized object words able to be recognized. Accordingly, the possibility of a navigating operation such as a correct map display, etc. by correct recognition is increased.

Further, in accordance with the car of the present invention, when a new audio signal is inputted during the execution of processing for a map display using a display means within the car by an inputted audio signal, the executed processing is interrupted and the map display processing is executed by the newly inputted audio signal. Accordingly, for example, when a specific region is designated in a speech and a place name, etc. are incorrectly inputted, a region name by a correct speech is recognized only by respeaking a correct place name, and a map in a correct position is displayed in the display means within the car. Therefore, it is possible to simply cope with a case in which a speech is inputted in error, without obstructing driving of the car, etc.

Further, in accordance with the car of the present invention, when a map is displayed on the basis of the recognition of an inputted speech and the recognized speech is not correctly recognized when first spoken, but is respoken, a recognized object word first recognized in error is removed from the candidates for recognized object words able to be recognized. Accordingly, the possibility of a correct map display by correct recognition is increased.

Having described preferred embodiments of the invention with reference to the accompanying drawings, it is to be understood that the invention is not limited to those precise embodiments and that various changes and modifications could be effected therein by one skilled in the art without departing from the spirit or scope of the invention as defined in the appended claims.

What is claimed is:

1. A map displaying apparatus comprising:
map data memory means for storing map data;
display means for displaying a map; and
a speech recognition unit for performing speech
recognition, wherein said speech recognition unit
includes:

sound input means for entering a sound signal;

speech recognition means for recognizing spoken
words in said entered sound signal and for converting
data thereof; and

control means for controlling operation of said speech
recognition means,

wherein when a second sound signal is entered while
said speech recognition means is recognizing spoken
words in or converting data of a first sound signal,
said control means cancels said recognizing or said
converting of said speech recognition means in order
to allow speech recognition to be performed on said
second sound signal.

2. The map displaying apparatus as claimed in claim 1,
wherein said control means executes an operation based on
whether a word recognized by said speech recognition
means is a command or a geographical name.
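
The cancelling behaviour recited in claims 1 and 2 can be illustrated with a minimal Python sketch. It is not the patented implementation: the class name, the `recognize` callback, the `COMMANDS` vocabulary and the log strings are all assumptions introduced for illustration. A second signal entered while the first is still pending discards the first, and the word finally recognized is classified as a command or a place name.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical vocabulary: which recognized words count as commands.
COMMANDS = {"zoom in", "zoom out"}


@dataclass
class RecognitionController:
    """Toy stand-in for the claimed control means (all names are assumptions)."""

    recognize: Callable[[str], str]  # hypothetical recognizer: signal -> word
    log: List[str] = field(default_factory=list)
    pending: Optional[str] = None  # signal whose processing is still in progress

    def enter_signal(self, signal: str) -> None:
        # A signal entered while another is pending cancels the earlier one.
        if self.pending is not None:
            self.log.append(f"cancelled:{self.pending}")
        self.pending = signal

    def finish(self) -> Optional[str]:
        # Complete recognition for whichever signal is still pending.
        if self.pending is None:
            return None
        word = self.recognize(self.pending)
        self.pending = None
        kind = "command" if word in COMMANDS else "place"
        self.log.append(f"{kind}:{word}")
        return word
```

In this sketch, entering "Osaka" and then "Tokyo" before `finish()` is called leaves only "Tokyo" to be recognized, which mirrors correcting a misspoken place name simply by respeaking it.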
3. A map displaying apparatus comprising:
map data memory means for storing map data; 
display means for displaying a map; and 
a speech recognition unit for performing speech 

recognition, wherein said speech recognition unit 
includes: 

sound signal input means for entering a sound signal; 

speech recognition means for recognizing spoken 
words in said entered sound signal and for convert- 
ing data thereof; 

sound generating means for generating sound; and 

control means for controlling operation of said speech 
recognition means and said sound generating means, 

wherein when a second sound signal is entered while 
said sound generating means is generating sound 
related to a first sound signal, said control means 
stops said speech recognition means from 
recognizing, converting, or generating sound related 
to said first sound signal to allow said speech rec- 
ognition unit to perform speech recognition on said 
second sound signal. 
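
As a rough illustration of claim 3, the following Python sketch stops spoken output as soon as a new sound signal is detected. The function name and the polling callback `new_signal_pending` are hypothetical; a real system would more likely react to an interrupt or event than poll between words.

```python
from typing import Callable, Iterable, List


def speak_until_interrupted(
    words: Iterable[str],
    new_signal_pending: Callable[[], bool],
) -> List[str]:
    """Return the words actually spoken before a new sound signal arrived.

    `new_signal_pending` is a hypothetical poll of the sound signal input
    means; it is checked before each word is output.
    """
    spoken: List[str] = []
    for word in words:
        if new_signal_pending():
            break  # stop output so the new signal can be recognized
        spoken.append(word)  # stand-in for synthesizing and playing the word
    return spoken
```

The caller would then hand the new signal to the recognizer instead of finishing the interrupted output.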

4. A map displaying apparatus comprising: 
map data memory means storing map data; 
display means for displaying a map; and 
a speech recognition unit for performing speech 

recognition, wherein said speech recognition unit 
includes: 

sound signal input means for entering a sound signal; 

speech recognition means for recognizing spoken 
words in said entered sound signal and for convert- 
ing data thereof; and 

control means for controlling operation of said speech 
recognition means, 

wherein when a second sound signal is entered within 
a predetermined time after a first sound signal is 
entered, said control means controls said speech 
recognition means to select a word previously 
selected from a group of words available for selec- 
tion by said speech recognition means. 

5. The map displaying apparatus as claimed in claim 4, 
wherein said control means controls said speech recognition 
means to select said word when said word is recognized as 
a geographical name and to not select said word when said 
word is recognized as a command. 
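
The re-input handling can be sketched in Python as follows, following the behaviour given in the description: a word recognized just before a prompt respeaking is dropped from the candidates, unless it is a command. The function name, the score dictionary and the 10-second window are assumptions; the source specifies only "a predetermined time".

```python
from typing import Dict, Optional, Set

REINPUT_WINDOW_S = 10.0  # assumed value; the source only says "predetermined time"


def pick_word(
    scores: Dict[str, float],
    commands: Set[str],
    previous_word: Optional[str],
    seconds_since_previous: Optional[float],
) -> str:
    """Pick the best-scoring candidate, skipping a word rejected by respeaking."""
    excluded: Set[str] = set()
    is_reinput = (
        previous_word is not None
        and seconds_since_previous is not None
        and seconds_since_previous <= REINPUT_WINDOW_S
    )
    # Only place names are excluded; a repeated command stays selectable.
    if is_reinput and previous_word not in commands:
        excluded.add(previous_word)
    remaining = {w: s for w, s in scores.items() if w not in excluded}
    if not remaining:
        remaining = scores  # nothing else to offer, fall back to all candidates
    return max(remaining, key=remaining.get)
```

With scores favouring "Kanagawa" over "Kagawa", a respeaking within the window after "Kanagawa" was output makes the sketch return "Kagawa", while a repeated command such as "zoom in" is still accepted.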

6. A navigation apparatus comprising: 
position detecting means for detecting a present position; 
map data memory means for storing map data; 
display means for displaying a map; and 
a speech recognition unit for performing speech 

recognition, wherein said speech recognition unit 
includes: 

sound signal input means for entering a sound signal; 

speech recognition means for recognizing spoken 
words in said entered sound signal and for convert- 
ing data thereof; and 

control means for controlling operation of said speech 
recognition means, 

wherein when a second sound signal is entered while 
said speech recognition means is recognizing spoken
words in or converting data of said first sound signal, 
said control means cancels said recognizing or con- 
verting of said speech recognition means in order to 
allow speech recognition to be performed on said 
second sound signal. 

7. The navigation apparatus as claimed in claim 6, 
wherein said control means executes an operation based on 
whether a word recognized by said speech recognition
means is a command or a geographical name. 

8. A navigation apparatus comprising: 

position detecting means for detecting a present position; 
map data memory means for storing map data; 
display means for displaying a map; and 
a speech recognition unit for performing speech 

recognition, wherein said speech recognition unit 

includes: 

sound signal input means for entering a sound signal; 

speech recognition means for recognizing spoken 
words in said entered sound signal and for convert- 
ing data thereof; 

sound generating means for generating sound; and 

control means for controlling operation of said speech 
recognition means and said sound generating means, 

wherein when a second sound signal is entered while 
said sound generating means is generating sound 
relating to a first sound signal, said control means 
stops said speech recognition means from said 
recognizing, converting, or generating sound related 
to said first sound signal to allow said speech rec- 
ognition unit to perform speech recognition on said 
second sound signal. 

9. A navigation apparatus comprising: 

position detecting means for detecting a present position;
map data memory means for storing map data; 
display means for displaying a map; and 
a speech recognition unit for performing speech 

recognition, wherein said speech recognition unit 

includes: 

sound signal input means for entering a sound signal; 

speech recognition means for recognizing spoken 
words in said entered sound signal and for con- 
verting data thereof; and 

control means for controlling operation of said 
speech recognition means, 

wherein when a second sound signal is entered 
within a predetermined time after a first sound 
signal is entered, said control means controls said 
speech recognition means to select a word previ- 
ously selected from a group of words available for 
selection by said speech recognition means. 

10. The navigation apparatus as claimed in claim 9, 
wherein said control means controls said speech recognition 
means to select said word when said word is recognized as 
a geographical name and to not select said word when said 
word is recognized as a command. 

11. A navigation method comprising: 

a position detecting step for detecting a present position; 
a map data reading step for reading map data from a 

storage device; 
a display step for displaying said map data; and 
a speech processing step for performing speech 

processing, wherein said speech recognition step 

includes: 

a sound signal input step for entering a sound signal; 
a speech recognition step for recognizing spoken words 

in said entered sound signal and for converting data 

thereof; and 

a control step for controlling operation of said speech 
recognition step, 

wherein when a second sound signal is entered while 
recognizing spoken words in or converting data of a 
first sound signal in said speech recognition step, 
said control step cancels said recognizing or said 
converting of said speech recognition step in order to
allow speech recognition to be performed on said 
second sound signal. 

12. A navigation method comprising: 

a position detecting step for detecting a present position;
a map data reading step for reading map data from a 

storage device; 
a display step for displaying said map data; and 
a speech processing step for performing speech

processing, wherein said speech processing step 

includes: 

a sound signal input step for entering a sound signal; 
a speech recognition step for recognizing spoken words 

in said entered sound signal and for converting data

thereof; 

a sound generating step for generating sound; and 
a control step for controlling operation of said speech 

recognition step and said sound generating step, 
wherein when a second sound signal is entered while
sound related to a first sound signal is being generated
in said sound generating step, said control step
cancels said recognizing, said converting, and said
sound generating related to said first sound signal in
order to allow speech recognition to be performed on
said second sound signal. 

13. A car with a navigation feature, said car comprising: 
a car body; 

steering means for steering said car body; 

position detecting means for detecting a present position

of said car body; 
map data memory means for storing map data; 
display means for displaying a map; and 
a speech recognition unit for performing speech

recognition, wherein said speech recognition unit 

includes: 

sound signal input means for entering a sound signal; 



speech recognition means for recognizing spoken 
words in said entered sound signal and for convert- 
ing data thereof; and 

control means for controlling operation of said speech 
recognition means, 

wherein when a second sound signal is entered while 
said speech recognition means is recognizing spoken 
words in or converting data of a first sound signal, 
said control means cancels said recognizing or said 
converting by said speech recognition means in order 
to allow speech recognition to be performed on said 
second sound signal. 
14. A car with a navigation feature, said car comprising: 
a car body; 

steering means for steering said car body; 

position detecting means for detecting a present position 

of said car body; 
map data memory means for storing map data; 
display means for displaying a map; and 
a speech recognition unit for performing speech 

recognition, wherein said speech recognition unit 

includes: 

sound signal input means for entering a sound signal; 

speech recognition means for recognizing spoken 
words in said entered sound signal and for convert- 
ing data thereof; 

sound generating means for generating sound; and 

control means for controlling operation of said speech 
recognition means and said sound generating means, 

wherein when a second sound signal is entered while 
said sound generating means is generating sound 
related to a first sound signal, said control means 
cancels said speech recognition unit from perform- 
ing speech recognition and generating sound based 
on said first sound signal in order to allow speech 
recognition to be performed on said second sound 
signal. 